Materials to reproduce the main analysis reported in "Transforming Stability into Change: How the Media Select and Report Opinion Polls". The raw media documents cannot be shared publicly, so the text analysis part is not included in the script. The data files are supplied solely for reproducing the results in the paper. For any other use of the data, or for access to the original files, please get in touch with the authors; we are happy to help.

Some of the data are also being used in other projects; please get in touch if this is relevant to you.

Contents:

1) ijpp-polls.R:           R script to produce the results. No automated export of tables/figures.
2) polls.csv:              poll-level data with mention counts.
3) polls-hierarchical.csv: stacked data file with article x polling company dyad grouping for hierarchical models.
4) polls-content.csv:      article-level data file with content-specific outcome variables included.

Variables:

1) polls.csv

"poll":               poll id.
"poll_date":          when the poll was carried out.
"pollingfirm":        polling company name.
"volat_own_prev":     volatility compared to previous poll (same company).
"n_significant":      number of significant (at the 0.05 level) changes compared to previous poll (same company).
"any_significant":    any significant change (from n_significant), 0 or 1.
"max_change":         maximum change compared to previous poll (same company).
"mentions_main":      number of mentions a poll had.
"is_mention_main":    any mention of the poll (from mentions_main), 0 or 1.
"e_proximate":        is it close to an election (EP or local election only), 0 or 1.
"days":               days since previous poll.
"days_sd":            days since previous poll, mean centered and divided by two standard deviations.
"volat_own_prev_sd":  volatility, mean centered and divided by two standard deviations.
"max_change_sd":      maximum change, mean centered and divided by two standard deviations.
"month_effect":       month (with year) when poll was carried out.
"year":               year when poll was carried out.

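The rescaled variables in polls.csv are described as "mean centered and divided by two standard deviations", and the 0/1 indicators are described as derived from counts. A minimal R sketch of both steps, under stated assumptions, might look as follows. The helper name rescale_2sd, the na.rm handling, the reading of "any" as "count greater than zero", and the toy values are illustrative assumptions; none of them is taken from ijpp-polls.R or polls.csv.

```r
# Toy values for illustration only; these are NOT taken from polls.csv.
toy <- data.frame(
  days = c(3, 7, 14, 21, 30),
  n_significant = c(0, 2, 0, 1, 0)
)

# Hypothetical helper: mean-center, then divide by two standard deviations.
rescale_2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))
}

# Rescaled version of a continuous variable (analogous to "days_sd").
toy$days_sd <- rescale_2sd(toy$days)

# 0/1 indicator from a count (analogous to "any_significant"),
# assuming "any" means a count greater than zero.
toy$any_significant <- as.integer(toy$n_significant > 0)
```

After this rescaling a variable has mean 0 and standard deviation 0.5, which puts continuous predictors on a scale roughly comparable to 0/1 indicators.
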
2) polls-hierarchical.csv

Same as above, with the following differences/additions:

* mention and is_mention stand for mentions_main and is_mention_main.

"media":    name of media outlet.
"partners": 1 if the media outlet is a partner of the polling organization, 0 otherwise.
"dyad_id":  combination of polling firm and media name (id variable).

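The "dyad_id" variable is described only as a combination of polling firm and media name. A rough R sketch of how such an id could be built and used as a grouping level is given below. The underscore separator and the toy firm and outlet names are assumptions made for illustration; the actual encoding used in polls-hierarchical.csv may differ.

```r
# Toy firm and outlet names for illustration only; not taken from the data.
toy <- data.frame(
  pollingfirm = c("FirmA", "FirmA", "FirmB"),
  media = c("OutletX", "OutletY", "OutletX"),
  stringsAsFactors = FALSE
)

# Assumed encoding: firm and outlet joined by an underscore.
toy$dyad_id <- paste(toy$pollingfirm, toy$media, sep = "_")

# Number of rows per dyad, i.e. the size of each grouping unit.
dyad_sizes <- table(toy$dyad_id)
```
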
3) polls-content.csv

"doc_id":           document id.
"row_id":           combined id, for raw data merge (not used).
"is_title_change":  does the title make a change reference (1), or not (0).
"text_isuncertain": does the main text make a reference to uncertainty (1), or not (0).
"expert_quote":     does the main text quote experts (1), or not (0).


